TJOIN v1.02 - join two related data tables

Revised 3-Apr-96. Copyright (c) 1996 by Rune Berg. TextTools Freeware.

Usage - Description - Example - Options - Limitations


USAGE

tjoin [log logfile] [options] [infile1] and infile2 [to outfile] [$i=$j ...]


DESCRIPTION

tjoin prints, to outfile, the join of the tables in infile1 and infile2, optionally using predicates to restrict output.

infile1 and infile2 are ASCII text files. tjoin treats each input line as a row of fields, separated by whitespace by default (see the -i and -a options to change this). The documentation for tcols describes field splitting in more detail.

tjoin ignores empty (whitespace only) input lines.

tjoin compares fields the same way as trows does.

Predicates of form $i=$j (where i and j are numbers >= 1) restrict output to the cases where the i'th field in infile1 is equal to the j'th field in infile2.
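
The predicate rule above can be sketched in Python. This is only an illustration, not tjoin's own code: the function names are hypothetical, plain string equality stands in for the trows comparison rules, and treating several predicates as "all must hold" is an assumption, since this document does not say how they combine.

```python
import re

def parse_predicate(text):
    """Turn a '$i=$j' string into a zero-based (i, j) index pair."""
    match = re.fullmatch(r"\$(\d+)=\$(\d+)", text)
    if match is None:
        raise ValueError(f"not a $i=$j predicate: {text!r}")
    i, j = int(match.group(1)), int(match.group(2))
    if i < 1 or j < 1:
        raise ValueError("field numbers start at 1")
    return i - 1, j - 1

def row_pair_matches(row1, row2, predicates):
    """Return True if every predicate holds for this pair of rows.

    Assumptions: plain string equality, and a field missing from a
    short row never matches.
    """
    for i, j in predicates:
        if i >= len(row1) or j >= len(row2):
            return False
        if row1[i] != row2[j]:
            return False
    return True
```

Here parse_predicate("$2=$2") gives (1, 1), so the pair of rows is kept only when the second field of each row is the same.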

If you don't specify infile1, tjoin reads from standard input.
If you don't specify outfile, tjoin writes to standard output.
If you don't specify logfile, tjoin writes error messages to standard error.

tjoin holds infile1 in memory while reading infile2, so you may want to specify the smaller input file as infile1.
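
The memory behaviour described above can be pictured with a short Python sketch. It is not tjoin's implementation; join_tables is a hypothetical name, and fields are split on whitespace only. The first table is loaded into a list, and the second is read one line at a time, so only the first table's size affects memory use.

```python
def join_tables(lines1, lines2, sep="\t"):
    """Yield every row of lines1 joined with every row of lines2."""
    # Hold the first table in memory, skipping whitespace-only lines.
    table1 = [line.split() for line in lines1 if line.strip()]
    # Stream the second table; nothing from it is kept after use.
    for line in lines2:
        fields2 = line.split()
        if not fields2:
            continue  # ignore empty (whitespace-only) lines
        for fields1 in table1:
            yield sep.join(fields1 + fields2)
```

With this loop order, each row of the second table is paired with all rows of the first before the next row of the second table is read.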


EXAMPLE

For example, consider the file "boys" containing the table:

      
	john  tennis
	john  golf
	tim   surfing
	al    tennis

and the file "girls" containing the table:

      
	sue   golf
	lisa  tennis

The command:

	tjoin boys and girls

produces the output below, which pairs every row of boys with every row of girls:

	john  tennis   sue   golf
	john  golf     sue   golf
	tim   surfing  sue   golf
	al    tennis   sue   golf
	john  tennis   lisa  tennis
	john  golf     lisa  tennis
	tim   surfing  lisa  tennis
	al    tennis   lisa  tennis

To find sports partners, use the command:

	tjoin boys and girls $2=$2

to produce the output below, which contains every pair of rows from the two files whose second fields are equal:

	john  golf     sue   golf
	john  tennis   lisa  tennis
	al    tennis   lisa  tennis
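
As a cross-check, the same result can be reproduced with a few lines of Python. This is a sketch that only mimics the example: the data is typed in by hand, and fields are compared as plain strings, which is an assumption about how tjoin compares them.

```python
boys = [["john", "tennis"], ["john", "golf"],
        ["tim", "surfing"], ["al", "tennis"]]
girls = [["sue", "golf"], ["lisa", "tennis"]]

# Outer loop over girls, inner loop over boys, to mirror the output order.
# b[1] and g[1] are the second fields, i.e. the $2=$2 predicate.
partners = [b + g for g in girls for b in boys if b[1] == g[1]]

for row in partners:
    print("  ".join(row))
```

Three of the eight possible pairs pass the test on the second fields.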


OPTIONS

-rM : Allow for M (2..8191) rows (lines) in infile1. Default is 500.

-cN : Allow for N (2..100) columns (fields per line) in infile1. Default is 10.

-iC : Separate fields in infile1 by character C (except \). Use \t to form a tab.

-aC : Separate fields in infile2 by character C (except \). Use \t to form a tab.

-oS : Separate output fields by string S, instead of the default tab character. Use \t to form a tab.

-v : Print version banner and usage info to standard error (or logfile, if given), then exit.
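
The effect of the -i, -a and -o separators can be sketched in Python as follows. The helper names are hypothetical, and two details are assumptions because this document does not settle them: that every occurrence of C ends a field, and that \t is expanded wherever it appears in S.

```python
def decode_separator(arg):
    """Expand the two-character sequence backslash-t into a real tab."""
    return arg.replace("\\t", "\t")

def split_fields(line, sep=None):
    """Split one input line into fields.

    sep=None means default whitespace splitting; otherwise the line is
    split at every occurrence of sep (an assumption, see above).
    """
    line = line.rstrip("\n")
    return line.split() if sep is None else line.split(sep)

def format_row(fields1, fields2, out_sep="\t"):
    """Join the fields of both rows with the output separator."""
    return out_sep.join(fields1 + fields2)
```

For instance, split_fields("al:tennis", ":") models reading a line under -i: and format_row(..., out_sep=", ") models writing it under a comma-and-space -o string.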


LIMITATIONS

The product of M and N (in -r and -c options shown above) must be no more than 16383.
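
The arithmetic behind this limit can be checked with a small Python sketch. The function names are hypothetical; the numbers are the ranges and defaults given for -r and -c above.

```python
def check_table_limits(rows=500, cols=10):
    """Return a list of problems with a proposed -rM / -cN pair."""
    problems = []
    if not 2 <= rows <= 8191:
        problems.append("M must be in 2..8191")
    if not 2 <= cols <= 100:
        problems.append("N must be in 2..100")
    if rows * cols > 16383:
        problems.append(f"M * N = {rows * cols} exceeds 16383")
    return problems

def max_rows_for(cols):
    """Largest M allowed for a given N under both documented limits."""
    return min(8191, 16383 // cols)
```

With the default of 10 columns, max_rows_for(10) is 1638, so the full 8191 rows are only available when N is 2.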

tjoin runs out of memory if infile1 is too large.


End of document